Effects of Sparse Initialization in Deep Belief Networks
Authors
Abstract
Similar Resources
Sparse Feature Learning for Deep Belief Networks
Unsupervised learning algorithms aim to discover the structure hidden in the data, and to learn representations that are more suitable as input to a supervised machine than the raw input. Many unsupervised methods are based on reconstructing the input from the representation, while constraining the representation to have certain desirable properties (e.g. low dimension, sparsity, etc). Others a...
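As an illustration of the reconstruct-under-constraints idea sketched in this abstract, the following is a minimal single-hidden-layer autoencoder whose objective combines reconstruction error with an L1 sparsity penalty on the learned representation. The layer sizes, penalty weight lam, and learning rate lr are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((100, 20))              # toy unlabelled data: 100 samples, 20 features

n_in, n_hid = 20, 50                   # overcomplete code, kept useful by the sparsity term
W = rng.normal(0, 0.1, (n_in, n_hid))  # tied encoder/decoder weights
b, c = np.zeros(n_hid), np.zeros(n_in)
lam, lr = 0.1, 0.05                    # sparsity weight and learning rate (assumed values)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for epoch in range(200):
    H = sigmoid(X @ W + b)             # representation (the "code")
    X_hat = H @ W.T + c                # reconstruction of the input from the code
    # objective = reconstruction error + L1 penalty pushing the code toward sparsity
    loss = np.mean((X_hat - X) ** 2) + lam * np.mean(np.abs(H))
    if epoch % 50 == 0:
        print(epoch, round(loss, 4))
    # hand-derived gradients for this small tied-weight model
    dXhat = 2 * (X_hat - X) / X.size
    dH = dXhat @ W + lam * np.sign(H) / H.size
    dZ = dH * H * (1 - H)
    W -= lr * (X.T @ dZ + dXhat.T @ H)
    b -= lr * dZ.sum(axis=0)
    c -= lr * dXhat.sum(axis=0)
```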
On weight initialization in deep neural networks
A proper initialization of the weights in a neural network is critical to its convergence. Current insights into weight initialization come primarily from linear activation functions. In this paper, I develop a theory for weight initializations with non-linear activations. First, I derive a general weight initialization strategy for any neural network using activation functions differentiable a...
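The theory referred to above concerns how the scale of random initial weights must account for the activation function. As a rough sketch of the underlying variance-scaling idea, the function below draws weights with a fan-based standard deviation and a gain factor for the non-linearity; the Glorot-style fan average and the example gain values are standard assumptions, not the specific strategy derived in the paper.

```python
import numpy as np

def scaled_init(fan_in, fan_out, gain=1.0, seed=0):
    """Draw weights so that activation variance is roughly preserved across a layer.

    gain compensates for the non-linearity, e.g. sqrt(2) is the usual choice
    for ReLU units (He-style), 1.0 for linear or tanh-like units (Glorot-style).
    """
    rng = np.random.default_rng(seed)
    std = gain * np.sqrt(2.0 / (fan_in + fan_out))   # fan-average scaling
    return rng.normal(0.0, std, size=(fan_in, fan_out))

# quick check: the pre-activation variance stays close to the input variance
x = np.random.default_rng(1).normal(size=(1000, 256))
W = scaled_init(256, 256, gain=1.0)
print(np.var(x), np.var(x @ W))   # both should be close to 1.0
```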
Initialization for the Method of Conditioning in Bayesian Belief Networks
Suermondt, H.J. and G.F. Cooper, Initialization for the method of conditioning in Bayesian belief networks (Research Note), Artificial Intelligence 50 (1991) 83-94. The method of conditioning allows us to use Pearl's probabilistic-inference algorithm in multiply connected belief networks by instantiating a subset of the nodes in the network, the loop cutset. To use the method of conditioning, w...
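As a toy illustration of the identity that the method of conditioning relies on, the brute-force sketch below checks that the posterior in a small multiply connected (diamond-shaped) network equals the mixture of posteriors computed with the loop-cutset variable clamped to each of its values, weighted by that variable's posterior probability. The network and its probability tables are invented for illustration; an actual implementation would run Pearl's polytree algorithm on the conditioned network rather than enumerating the joint.

```python
import itertools
import numpy as np

# Toy diamond network A -> B, A -> C, (B, C) -> D; all variables binary.
# Conditioning on the loop cutset {A} renders the remaining network singly connected.
pA = {0: 0.6, 1: 0.4}
pB_A = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}   # pB_A[a][b] = P(B=b | A=a)
pC_A = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}
pD_BC = {(b, c): {0: 0.5 + 0.1 * b - 0.3 * c, 1: 0.5 - 0.1 * b + 0.3 * c}
         for b in (0, 1) for c in (0, 1)}

def joint(a, b, c, d):
    return pA[a] * pB_A[a][b] * pC_A[a][c] * pD_BC[(b, c)][d]

def posterior_B(evidence_d, clamp_a=None):
    """P(B | D=evidence_d), optionally with the cutset variable A clamped."""
    scores = np.zeros(2)
    for a, b, c in itertools.product((0, 1), repeat=3):
        if clamp_a is not None and a != clamp_a:
            continue
        scores[b] += joint(a, b, c, evidence_d)
    return scores / scores.sum()

# Method of conditioning: mix the clamped posteriors with weights P(A=a | D=1).
pA_given_D = np.array([sum(joint(a, b, c, 1) for b in (0, 1) for c in (0, 1))
                       for a in (0, 1)])
pA_given_D /= pA_given_D.sum()
mixed = sum(pA_given_D[a] * posterior_B(1, clamp_a=a) for a in (0, 1))

print(posterior_B(1))   # direct answer
print(mixed)            # identical: obtained by conditioning on the loop cutset {A}
```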
Studies in Deep Belief Networks
Deep networks are able to learn good representations of unlabelled data via a greedy layer-wise approach to training. One challenge lies in choosing the layer type to use: an autoencoder or a restricted Boltzmann machine, with or without sparsity regularization. The layer choice directly affects the kind of representations learned. In this paper, we examine sparse autoencoders and...
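One common form of the sparsity regularization mentioned in this abstract penalizes the KL divergence between a small target activation rate and each hidden unit's mean activation over a batch. The sketch below computes such a penalty; the target rate rho and weight beta are illustrative assumptions.

```python
import numpy as np

def kl_sparsity_penalty(H, rho=0.05, beta=3.0, eps=1e-8):
    """KL(rho || rho_hat) summed over hidden units.

    H    : (batch, n_hidden) activations in (0, 1), e.g. sigmoid outputs
    rho  : target mean activation per hidden unit (assumed small)
    beta : weight of the penalty in the overall layer-wise objective
    """
    rho_hat = np.clip(H.mean(axis=0), eps, 1 - eps)   # empirical activation rate
    kl = rho * np.log(rho / rho_hat) + (1 - rho) * np.log((1 - rho) / (1 - rho_hat))
    return beta * kl.sum()

# example: a batch of sigmoid activations from some encoder or RBM hidden layer
H = 1.0 / (1.0 + np.exp(-np.random.default_rng(0).normal(size=(64, 128))))
print(kl_sparsity_penalty(H))   # large, because the mean activation is ~0.5, not 0.05
```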
Deep Belief Networks
The important aspect of this layer-wise training procedure is that, provided the number of features per layer does not decrease, [6] showed that each extra layer increases a variational lower bound on the log probability of the data. So layer-by-layer training can be repeated several times to learn a deep, hierarchical model in which each layer of features captures strong high-order correlations b...
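The layer-by-layer procedure can be sketched as follows: train an RBM on the data, use its hidden activations as the input to the next RBM, and repeat, keeping each layer at least as wide as the one below so that the bound-improvement condition above applies. The CD-1 update rule and the layer sizes used here are generic choices, not details taken from this particular paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_rbm(data, n_hidden, epochs=10, lr=0.05):
    """Train a binary RBM with one step of contrastive divergence (CD-1)."""
    n_visible = data.shape[1]
    W = rng.normal(0, 0.01, (n_visible, n_hidden))
    a, b = np.zeros(n_visible), np.zeros(n_hidden)
    for _ in range(epochs):
        v0 = data
        ph0 = sigmoid(v0 @ W + b)                        # positive phase
        h0 = (rng.random(ph0.shape) < ph0).astype(float)
        pv1 = sigmoid(h0 @ W.T + a)                      # one reconstruction step
        ph1 = sigmoid(pv1 @ W + b)                       # negative phase
        W += lr * (v0.T @ ph0 - pv1.T @ ph1) / len(data)
        a += lr * (v0 - pv1).mean(axis=0)
        b += lr * (ph0 - ph1).mean(axis=0)
    return W, a, b

# Greedy layer-wise stacking: each layer is at least as wide as the previous one,
# the condition under which adding a layer improves the variational bound.
X = (rng.random((200, 50)) < 0.3).astype(float)          # toy binary data
layer_sizes, layers, inp = [50, 80, 80], [], X
for n_hid in layer_sizes[1:]:
    W, a, b = train_rbm(inp, n_hid)
    layers.append((W, a, b))
    inp = sigmoid(inp @ W + b)                           # features for the next layer
```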
Journal
Journal title: Computer Science
Year: 2015
ISSN: 1508-2806
DOI: 10.7494/csci.2015.16.4.313